AITopics | Mediterranean Sea

GSDF: 3DGS Meets SDF for Improved Neural Rendering and Reconstruction

Neural Information Processing Systems, Mar-27-2025, 13:26:51 GMT

Representing 3D scenes from multiview images remains a core challenge in computer vision and graphics, requiring both reliable rendering and reconstruction, which often conflicts due to the mismatched prioritization of image quality over precise underlying scene geometry. Although both neural implicit surfaces and explicit Gaussian primitives have advanced with neural rendering techniques, current methods impose strict constraints on density fields or primitive shapes, which enhances the affinity for geometric reconstruction at the sacrifice of rendering quality. To address this dilemma, we introduce GSDF, a dual-branch architecture combining 3D Gaussian Splatting (3DGS) and neural Signed Distance Fields (SDF). Our approach leverages mutual guidance and joint supervision during the training process to mutually enhance reconstruction and rendering. Specifically, our method guides the Gaussian primitives to locate near potential surfaces and accelerates the SDF convergence. This implicit mutual guidance ensures robustness and accuracy in both synthetic and real-world scenarios. Experimental results demonstrate that our method boosts the SDF optimization process to reconstruct more detailed geometry, while reducing floaters and blurry edge artifacts in rendering by aligning Gaussian primitives with the underlying geometry.
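
As a rough illustration of what it means for Gaussian primitives to sit near a surface described by a signed distance field, the Python sketch below uses an analytic sphere SDF in place of a learned network. The names `sphere_sdf` and `near_surface`, the unit radius, and the threshold `eps` are assumptions made for this example; the abstract does not say how GSDF implements its guidance.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def sphere_sdf(p: Point, radius: float = 1.0) -> float:
    """Signed distance from p to a sphere centred at the origin.

    Negative inside the sphere, zero on the surface, positive outside.
    """
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def near_surface(centers: List[Point], eps: float = 0.1) -> List[Point]:
    """Keep the primitive centres whose |SDF| is below eps (hypothetical threshold)."""
    return [c for c in centers if abs(sphere_sdf(c)) < eps]


# Toy primitive centres: on the surface, at the centre, just outside, far away.
centers = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.05, 0.0), (3.0, 0.0, 0.0)]
kept = near_surface(centers)
```

Here only the first and third centres lie within `eps` of the sphere. A learned SDF would replace `sphere_sdf`, and a training procedure would more plausibly move or prune primitives gradually than filter them once.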

artificial intelligence, gaussian primitive, machine learning, (17 more...)

Country:

Asia > China (0.28)
Asia > Middle East > Israel > Mediterranean Sea (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation

Jungyoon Lee, Hannes Stärk

Neural Information Processing Systems, Mar-27-2025, 13:16:46 GMT

Predicting low-energy molecular conformations given a molecular graph is an important but challenging task in computational drug discovery. Existing state-of-the-art approaches either resort to large scale transformer-based models that diffuse over conformer fields, or use computationally expensive methods to generate initial structures and diffuse over torsion angles. In this work, we introduce Equivariant Transformer Flow (ET-Flow).

artificial intelligence, machine learning, natural language, (18 more...)

Country:

Asia > Middle East > Israel > Mediterranean Sea (0.24)
North America > Canada > Quebec (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

VFIMamba: Video Frame Interpolation with State Space Models

Neural Information Processing Systems, Mar-27-2025, 06:46:56 GMT

Inter-frame modeling is pivotal in generating intermediate frames for video frame interpolation (VFI). Current approaches predominantly rely on convolution or attention-based models, which often either lack sufficient receptive fields or entail significant computational overheads. Recently, Selective State Space Models (S6) have emerged, tailored specifically for long sequence modeling, offering both linear complexity and data-dependent modeling capabilities. In this paper, we propose VFIMamba, a novel frame interpolation method for efficient and dynamic inter-frame modeling by harnessing the S6 model. Our approach introduces the Mixed-SSM Block (MSB), which initially rearranges tokens from adjacent frames in an interleaved fashion and subsequently applies multi-directional S6 modeling.
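
The interleaved rearrangement mentioned above can be pictured with a minimal Python sketch. It alternates tokens from two adjacent frames into a single sequence, which is one plausible reading of "interleaved"; the abstract does not give the exact ordering or scan directions, and the function names `interleave_tokens` and `deinterleave_tokens` are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def interleave_tokens(frame_a: Sequence[T], frame_b: Sequence[T]) -> List[T]:
    """Alternate tokens from two equally long sequences: a0, b0, a1, b1, ..."""
    if len(frame_a) != len(frame_b):
        raise ValueError("both frames must contribute the same number of tokens")
    merged: List[T] = []
    for tok_a, tok_b in zip(frame_a, frame_b):
        merged.append(tok_a)
        merged.append(tok_b)
    return merged


def deinterleave_tokens(merged: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Undo interleave_tokens by splitting even and odd positions."""
    return list(merged[0::2]), list(merged[1::2])


# Toy token labels standing in for feature vectors.
mixed = interleave_tokens(["a0", "a1", "a2"], ["b0", "b1", "b2"])
restored = deinterleave_tokens(mixed)
```

In this ordering, tokens at the same position in the two frames become neighbours in the sequence, so a sequence model scanning it sees them back to back.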

artificial intelligence, machine learning, natural language, (16 more...)

Country:

Asia > Middle East > Israel > Mediterranean Sea (0.24)
Asia > China (0.14)
Europe > Germany (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Comprehensive Manuscript Assessment with Text Summarization Using 69707 articles

Sun, Qichen, Lu, Yuxing, Xia, Kun, Chen, Li, Sun, He, Wang, Jinzhuo

arXiv.org Artificial Intelligence, Mar-26-2025

Rapid and efficient assessment of the future impact of research articles is a significant concern for both authors and reviewers. The most common standard for measuring the impact of academic papers is the number of citations. In recent years, numerous efforts have been undertaken to predict citation counts within various citation windows. However, most of these studies focus solely on a specific academic field or require early citation counts for prediction, rendering them impractical for the early-stage evaluation of papers. In this work, we harness Scopus to curate a significantly comprehensive and large-scale dataset of information from 69707 scientific articles sourced from 99 journals spanning multiple disciplines. We propose a deep learning methodology for the impact-based classification tasks, which leverages semantic features extracted from the manuscripts and paper metadata. To summarize the semantic features, such as titles and abstracts, we employ a Transformer-based language model to encode semantic features and design a text fusion layer to capture shared information between titles and abstracts. We specifically focus on the following impact-based prediction tasks using information of scientific manuscripts in pre-publication stage: (1) The impact of journals in which the manuscripts will be published. (2) The future impact of manuscripts themselves. Extensive experiments on our datasets demonstrate the superiority of our proposed model for impact-based prediction tasks. We also demonstrate potentials in generating manuscript's feedback and improvement suggestions.

citation count, machine learning, natural language, (19 more...)

arXiv:2503.20835

Country: Asia > Middle East > Israel > Mediterranean Sea (0.24)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Joint Extraction Matters: Prompt-Based Visual Question Answering for Multi-Field Document Information Extraction

Loem, Mengsay, Hosaka, Taiju

arXiv.org Artificial Intelligence, Mar-21-2025

Visual question answering (VQA) has emerged as a flexible approach for extracting specific pieces of information from document images. However, existing work typically queries each field in isolation, overlooking potential dependencies across multiple items. This paper investigates the merits of extracting multiple fields jointly versus separately. Through experiments on multiple large vision language models and datasets, we show that jointly extracting fields often improves accuracy, especially when the fields share strong numeric or contextual dependencies. We further analyze how performance scales with the number of requested items and use a regression based metric to quantify inter field relationships. Our results suggest that multi field prompts can mitigate confusion arising from similar surface forms and related numeric values, providing practical methods for designing robust VQA systems in document information extraction tasks.

large language model, machine learning, natural language, (22 more...)

arXiv:2503.16868

Country:

North America > United States > Texas (0.14)
North America > Mexico > Mexico City (0.14)
Asia > Middle East > Israel > Mediterranean Sea (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.91)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

Neural Information Processing Systems, Mar-20-2025, 20:59:44 GMT

Language model (LM) agents are increasingly being used to automate complicated tasks in digital environments. Just as humans benefit from powerful software applications, such as integrated development environments, for complex tasks like software engineering, we posit that LM agents represent a new category of end users with their own needs and abilities, and would benefit from specially-built interfaces to the software they use. We investigate how interface design affects the performance of language model agents. As a result of this exploration, we introduce SWE-agent: a system that facilitates LM agents to autonomously use computers to solve software engineering tasks. SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs. We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively, far exceeding the previous state-of-the-art achieved with non-interactive LMs. Finally, we provide insight on how the design of the ACI can impact agents' behavior and performance.
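
For readers unfamiliar with the pass@1 metric quoted above: when each task gets exactly one attempt, it reduces to the fraction of tasks solved. The Python sketch below shows that calculation. The function name `pass_at_1` and the toy outcomes are assumptions for illustration and are not taken from the paper's data.

```python
from typing import Sequence


def pass_at_1(solved: Sequence[bool]) -> float:
    """Fraction of tasks solved on the first (and only) attempt."""
    if not solved:
        raise ValueError("need at least one task result")
    return sum(solved) / len(solved)


# Toy outcomes for four hypothetical tasks: three solved, one not.
outcomes = [True, True, False, True]
rate = pass_at_1(outcomes)
```

With several samples per task, an estimator for the more general pass@k would be needed instead of this simple ratio.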

large language model, machine learning, natural language, (23 more...)

Country:

North America > United States > Texas (0.13)
North America > United States > California > Santa Clara County (0.13)
Asia > Middle East > Israel > Mediterranean Sea (0.13)

Genre:

Research Report > Experimental Study (1.00)
Workflow (0.92)
Overview (0.92)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Torsional Geometric Generation of Molecular 3D Conformer Ensembles

Neural Information Processing Systems, Mar-19-2025, 20:06:11 GMT

Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery. Existing generative models have several drawbacks including lack of modeling important molecular geometry elements (e.g., torsion angles), separate optimization stages prone to error accumulation, and the need for structure fine-tuning based on approximate classical force-fields or computationally expensive methods.

artificial intelligence, conformer, machine learning, (13 more...)

Country: Asia > Middle East > Israel > Mediterranean Sea (0.24)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

GSDF: 3DGS Meets SDF for Improved Neural Rendering and Reconstruction

Neural Information Processing Systems, Mar-17-2025, 23:18:45 GMT

Representing 3D scenes from multiview images remains a core challenge in computer vision and graphics, requiring both reliable rendering and reconstruction, which often conflicts due to the mismatched prioritization of image quality over precise underlying scene geometry. Although both neural implicit surfaces and explicit Gaussian primitives have advanced with neural rendering techniques, current methods impose strict constraints on density fields or primitive shapes, which enhances the affinity for geometric reconstruction at the sacrifice of rendering quality. To address this dilemma, we introduce GSDF, a dual-branch architecture combining 3D Gaussian Splatting (3DGS) and neural Signed Distance Fields (SDF). Our approach leverages mutual guidance and joint supervision during the training process to mutually enhance reconstruction and rendering. Specifically, our method guides the Gaussian primitives to locate near potential surfaces and accelerates the SDF convergence.

artificial intelligence, improved neural rendering and reconstruction, machine learning, (4 more...)

Country: Asia > Middle East > Israel > Mediterranean Sea (0.29)

Technology:

Information Technology > Artificial Intelligence > Vision (0.63)
Information Technology > Artificial Intelligence > Machine Learning (0.43)

VFIMamba: Video Frame Interpolation with State Space Models

Neural Information Processing Systems, Mar-17-2025, 18:30:21 GMT

Inter-frame modeling is pivotal in generating intermediate frames for video frame interpolation (VFI). Current approaches predominantly rely on convolution or attention-based models, which often either lack sufficient receptive fields or entail significant computational overheads. Recently, Selective State Space Models (S6) have emerged, tailored specifically for long sequence modeling, offering both linear complexity and data-dependent modeling capabilities. In this paper, we propose VFIMamba, a novel frame interpolation method for efficient and dynamic inter-frame modeling by harnessing the S6 model. Our approach introduces the Mixed-SSM Block (MSB), which initially rearranges tokens from adjacent frames in an interleaved fashion and subsequently applies multi-directional S6 modeling.

artificial intelligence, state space model, video frame interpolation, (2 more...)

Country: Asia > Middle East > Israel > Mediterranean Sea (0.28)

Technology:

Information Technology > Artificial Intelligence > Vision (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.64)

The last "scanner" you'll ever need is now 24.99

Popular Science, Mar-8-2025, 12:00:00 GMT

Make more room on your desk by swapping that clunky old scanner for the iScanner app. Lifetime access comes with your own iOS scanner, cloud storage, and PDF editing. It usually costs 199, but you can get this lifetime subscription for 24.99. Stay organized even when you're working on the go by scanning documents into a digital format. The app is ideal for people who work in the field and for students moving around campus all day.

artificial intelligence, scanner, subscription, (4 more...)

Country: Asia > Middle East > Israel > Mediterranean Sea (0.26)

Industry: Marketing (0.40)

Technology: Information Technology > Artificial Intelligence (0.33)
